Hungarian named entity recognition with a maximum entropy approach

Authors

  • Dániel Varga
  • Eszter Simon
Abstract

In the analysis of natural language text, a key step is named entity recognition: finding all complex noun phrases that denote persons, organizations, locations, and other entities designated by a name. In this paper we introduce hunner, an open source, language-independent named entity recognition system, and present results for Hungarian. When the input to hunner is already morphologically analyzed, we apply the system together with the hunpos morphological disambiguator, but hunner is also capable of working on raw (morphologically unanalyzed) text.
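
To illustrate what a maximum entropy approach to named entity tagging can look like, here is a minimal sketch in Python. It is not hunner's implementation: the feature set, the label set (PER, LOC, O), the toy sentences, and training by plain stochastic gradient ascent are all assumptions made for this example. Each token is classified independently with p(label | features) proportional to the exponential of the summed weights of its active (feature, label) pairs.

```python
import math
from collections import defaultdict

LABELS = ["PER", "LOC", "O"]  # hypothetical label set for this sketch

# Invented toy data, only for illustration.
TOY_SENTENCES = [
    (["Anna", "lives", "in", "Budapest"], ["PER", "O", "O", "LOC"]),
    (["Péter", "works", "in", "Szeged"], ["PER", "O", "O", "LOC"]),
    (["She", "lives", "in", "Szeged"], ["O", "O", "O", "LOC"]),
]


def token_features(tokens, i):
    """Hypothetical binary features for the token at position i."""
    word = tokens[i]
    prev_word = tokens[i - 1].lower() if i > 0 else "<s>"
    return [
        "bias",
        "lower=" + word.lower(),
        "is_capitalized=" + str(word[:1].isupper()),
        "sentence_initial=" + str(i == 0),
        "prev=" + prev_word,
    ]


def predict_proba(weights, labels, feats):
    """Return p(label | feats) under the log-linear (maximum entropy) model."""
    scores = {y: sum(weights[(f, y)] for f in feats) for y in labels}
    top = max(scores.values())  # subtract the maximum for numerical stability
    exp_scores = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exp_scores.values())
    return {y: v / z for y, v in exp_scores.items()}


def build_examples(sentences):
    """Turn (tokens, tags) pairs into (feature list, gold label) pairs."""
    examples = []
    for tokens, tags in sentences:
        for i, gold in enumerate(tags):
            examples.append((token_features(tokens, i), gold))
    return examples


def train(examples, labels, epochs=200, lr=0.5):
    """Stochastic gradient ascent on the conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, gold in examples:
            probs = predict_proba(weights, labels, feats)
            for y in labels:
                # Gradient: observed minus expected count of each active feature.
                grad = (1.0 if y == gold else 0.0) - probs[y]
                for f in feats:
                    weights[(f, y)] += lr * grad
    return weights


def tag(weights, labels, tokens):
    """Assign the most probable label to each token independently."""
    tags = []
    for i in range(len(tokens)):
        probs = predict_proba(weights, labels, token_features(tokens, i))
        tags.append(max(labels, key=lambda y: probs[y]))
    return tags


if __name__ == "__main__":
    model = train(build_examples(TOY_SENTENCES), LABELS)
    print(tag(model, LABELS, ["Anna", "lives", "in", "Budapest"]))
```

A practical system would add regularization, richer features (for example morphological analyses or gazetteer lookups), and some way of keeping the label sequence consistent; none of that is shown here.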

Similar resources

A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features

Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three methods have conventionally been used to extract named entities from text: rule-based, machine-learning-based, and hybrids of the two. Machine-learning-based methods perform well on Persian if they are trained with good features. To get good performanc...

Maximum Entropy Approach based Named Entity Recognition in Punjabi Language

Named Entity Recognition is the task of identifying and classifying named entities into predefined categories such as person, location, and organization. NER is used in many applications, such as text summarization, text classification, question answering, and machine translation. For English, a lot of work has already been done in the field of NER, where capitalization is a major key fo...

Named Entity Recognition: A Maximum Entropy Approach Using Global Information

This paper presents a maximum entropy-based named entity recognizer (NER). It differs from previous machine learning-based NERs in that it uses information from the whole document to classify each word, with just one classifier. Previous work that involves the gathering of information from the whole document often uses a secondary classifier, which corrects the mistakes of a primary sentencebas...

Ranking Algorithms for Named Entity Extraction: Boosting and the Voted Perceptron

This paper describes algorithms which rerank the top N hypotheses from a maximum-entropy tagger, the application being the recovery of named-entity boundaries in a corpus of web data. The first approach uses a boosting algorithm for ranking problems. The second approach uses the voted perceptron algorithm. Both algorithms give comparable, significant improvements over the maximum-entropy baseli...

ME-CSSR: an Extension of CSSR using Maximum Entropy Models

In this work, an extension of the CSSR algorithm using Maximum Entropy Models is introduced. Preliminary experiments to perform Named Entity Recognition with this new system are presented.

Journal title:
  • Acta Cybern.

Volume 18

Publication date: 2007